IN THE CLAIMS:

Claims 1, 40, 50, 60, 69 and 70 have been amended, as follows: 

1. (currently amended) An apparatus for converting an input voice signal into an
output voice signal according to a target voice signal, the apparatus comprising: 

an input device that provides the input voice signal composed of an original 
sinusoidal component and an original residual component other than the original 
sinusoidal component; 

an extracting device that extracts original attribute data from at least the 
sinusoidal component of the input voice signal, the original attribute data being 
characteristic of the input voice signal and containing amplitude data representing an 
amplitude of the input voice signal in the form of static amplitude representing a basic 
variation of the amplitude and vibrato-like amplitude data representing a minute 
variation of the amplitude, superposed on the basic variation of the amplitude, pitch 
data representing a pitch of the input voice signal, and spectral shape data representing 
a spectral shape of the input voice signal; 

a synthesizing device that synthesizes new attribute data based on both of the 
original attribute data derived from the input voice signal and target attribute data being 
characteristic of the target voice signal composed of a target sinusoidal component and 
a target residual component other than the sinusoidal component, the target attribute 
data being derived from at least the target sinusoidal component, and containing 
amplitude data representing an amplitude of the target voice signal in the form of static 
amplitude data representing a basic variation of the amplitude and vibrato-like 
amplitude data representing a minute variation of the amplitude, superposed on the 
basic variation of the amplitude, pitch data representing a pitch of the target voice
signal, and spectral shape data representing a spectral shape of the target voice signal,
the synthesizing device selecting the static amplitude data, the vibrato-like amplitude 
data, the pitch data and the spectral shape data from either of the original attribute data 
and the target attribute data so as to synthesize the new attribute data in the form of a 
combination of the selected static amplitude data, the selected vibrato-like amplitude 
data, the selected pitch data and the selected spectral shape data; and

an output device that operates based on the new attribute data and either of the 
original residual component and the target residual component for producing the output 
voice signal. 
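
By way of illustration only, the per-attribute selection recited in claim 1 might be sketched as follows. The `Attributes` container, the `select_attributes` function, and the `take_from_target` mapping are hypothetical names introduced for this sketch and do not appear in the claims; the sketch covers only the selection step, not extraction or output.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical container: the field names mirror the claim language only.
@dataclass(frozen=True)
class Attributes:
    static_amplitude: float
    vibrato_amplitude: float
    pitch: float
    spectral_shape: Tuple[float, ...]

FIELDS = ("static_amplitude", "vibrato_amplitude", "pitch", "spectral_shape")

def select_attributes(original: Attributes,
                      target: Attributes,
                      take_from_target: Dict[str, bool]) -> Attributes:
    """Build new attribute data by taking each element from exactly one source.

    Elements not listed in take_from_target default to the original.
    """
    chosen = {}
    for name in FIELDS:
        source = target if take_from_target.get(name, False) else original
        chosen[name] = getattr(source, name)
    return Attributes(**chosen)

# Example: keep the singer's own loudness contour and timbre, but take the
# pitch and the vibrato-like amplitude variation from the target.
original = Attributes(0.8, 0.02, 220.0, (1.0, 0.5))
target = Attributes(0.6, 0.05, 440.0, (0.9, 0.7))
new = select_attributes(original, target,
                        {"pitch": True, "vibrato_amplitude": True})
```
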

Claims 2-4 (cancelled).

5. (original) The apparatus according to claim 1, wherein the synthesizing 
device operates based on both the original attribute data composed of a set of original 
attribute data elements and target attribute data composed of another set of target 
attribute data elements in correspondence with one another to define each 
corresponding pair of the original attribute data element and the target attribute data 
element, such that the synthesizing device selects one of the original attribute data 
element and the target attribute data element, such that the synthesizing device selects 
one of the original attribute data element and the target attribute data element from 
each corresponding pair for synthesizing the new attribute data composed of a set of 
new attribute data elements each selected from each corresponding pair. 

6. (original) The apparatus according to claim 1, wherein the synthesizing 
device operates based on both of the original attribute data composed of a set of 
original attribute data elements and the target attribute data composed of another set of
target attribute data elements in correspondence with one another to define each 
corresponding pair of the original attribute data element and the target attribute data 
element, such that the synthesizing device interpolates with one another the original 
attribute data element and the target attribute data element of each corresponding pair 
for synthesizing the new attribute data composed of a set of new attribute data 
elements each interpolated from each corresponding pair. 
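
By way of illustration only, the pairwise interpolation recited in claim 6 might be sketched as follows. The claim does not specify the form of interpolation; linear interpolation with a single `weight` parameter is an assumption of this sketch, and `interpolate_attributes` is a hypothetical name.

```python
from typing import Dict

def interpolate_attributes(original: Dict[str, float],
                           target: Dict[str, float],
                           weight: float) -> Dict[str, float]:
    """Linearly interpolate each corresponding pair of attribute elements.

    weight = 0.0 returns the original values, weight = 1.0 the target values.
    Linear blending is an assumption; the claim only says "interpolates".
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if original.keys() != target.keys():
        raise ValueError("elements must correspond one to one")
    return {name: (1.0 - weight) * original[name] + weight * target[name]
            for name in original}

blended = interpolate_attributes({"pitch": 200.0, "amplitude": 0.5},
                                 {"pitch": 400.0, "amplitude": 1.0},
                                 weight=0.25)
```
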

7. (original) The apparatus according to claim 1, further comprising a peripheral
device that provides the target attribute data containing pitch data representing a pitch 
of the target voice signal at a standard key, and a key control device that operates when
a user key different than the standard key is designated to the input voice signal for 
adjusting the pitch data according to a difference between the standard key and the 
user key.
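
By way of illustration only, the key adjustment recited in claim 7 might be sketched as follows. The claim does not say how keys or pitch are represented; this sketch assumes pitch data in hertz, keys expressed as semitone offsets, and equal temperament. `shift_pitch_to_user_key` is a hypothetical name.

```python
from typing import List

def shift_pitch_to_user_key(pitch_hz: List[float],
                            standard_key: int,
                            user_key: int) -> List[float]:
    """Transpose target pitch data by the key difference.

    Keys are assumed to be semitone offsets and the tuning is assumed to be
    equal temperament, so one semitone is a frequency ratio of 2 ** (1 / 12).
    """
    semitones = user_key - standard_key
    ratio = 2.0 ** (semitones / 12.0)
    return [p * ratio for p in pitch_hz]

# A user key one octave (12 semitones) above the standard key doubles the pitch.
shifted = shift_pitch_to_user_key([220.0, 440.0], standard_key=0, user_key=12)
```
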

8. (original) The apparatus according to claim 1, further comprising a peripheral
device that provides the target attribute data divided into a sequence of frames 
arranged at a standard tempo of the target voice signal, and a tempo control device that 
operates when a user tempo different from the standard tempo is designated to the 
input voice signal for adjusting the sequence of frames of the target attribute data 
according to a difference between the standard tempo and the user tempo, thereby 
enabling the synthesizing device to synthesize the new attribute data based on both the 
original attribute data and the target attribute data synchronously with each other at the 
user tempo designated to the input voice signal. 

9. (original) The apparatus according to claim 8, wherein the tempo control 



600660253v1 




PATENT 
51270-245599 

device adjusts the sequence of the frames of the target attribute data according to a 
difference between the standard tempo and the user tempo, such that an additional 
frame of the target attribute data is filled into the sequence of the frames according to 
the difference between the standard tempo and the user tempo, such that an additional 
frame of the target attribute data is filled into the sequence of frames of the target 
attribute data by interpolation of the target attribute data so as to match with a 
sequence of frames of the original attribute data provided from the extracting device. 
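
By way of illustration only, the frame filling recited in claims 8 and 9 might be sketched as follows. The sketch assumes each frame carries a single scalar value and that additional frames are produced by linear interpolation between neighbouring target frames; `stretch_frames` and `n_out` are hypothetical names.

```python
from typing import List

def stretch_frames(target_frames: List[float], n_out: int) -> List[float]:
    """Resample a sequence of target frames to n_out frames.

    Extra frames are filled in by linear interpolation between neighbours so
    that the target sequence matches the length of the original sequence.
    """
    if n_out < 1 or not target_frames:
        raise ValueError("need at least one input frame and one output frame")
    if len(target_frames) == 1 or n_out == 1:
        return [target_frames[0]] * n_out
    step = (len(target_frames) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), len(target_frames) - 2)  # left neighbour index
        frac = pos - lo                             # distance past left neighbour
        out.append((1.0 - frac) * target_frames[lo]
                   + frac * target_frames[lo + 1])
    return out

# A slower user tempo needs more frames: three target frames become five.
stretched = stretch_frames([0.0, 1.0, 2.0], 5)
```
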

10. (original) The apparatus according to claim 1, further comprising a 
synchronizing device that compares the target attribute data provided in the form of a 
first sequence of frames with the original attribute data provided in the form of a second 
sequence of frames so as to detect a false frame that is present in the second 
sequence but is absent from the first sequence, and that selects a dummy frame 
occurring around the false frame in the first sequence so as to compensate for the false 
frame, thereby synchronizing the first sequence containing the dummy frame to the 
second sequence containing the false frame. 

11. (original) The apparatus according to claim 1, wherein the synthesizing
device modifies the new attribute data so that the output device produces the output 
voice signal based on the modified new attribute data. 

12. (original) The apparatus according to claim 1, wherein the synthesizing
device synthesizes additional attribute data in addition to the new attribute data so that 
the output device concurrently produces the output voice signal based on the new 
attribute data and an additional voice signal based on the additional attribute data in a 
different pitch than that of the output voice signal. 

Claims 13-39 (cancelled).

40. (currently amended) A method of converting an input voice signal into an 
output voice signal according to a target voice signal, the method comprising the steps 
of: 

providing the input voice signal composed of an original sinusoidal component 
and an original residual component other than the original sinusoidal component; 

extracting original attribute data from at least the sinusoidal component of the 
input voice signal, the original attribute data being characteristic of the input voice signal 
and containing amplitude data representing an amplitude of the input voice signal in the 
form of static amplitude data representing a basic variation of the amplitude and 
vibrato-like amplitude data representing a minute variation of the amplitude, 
superposed on the basic variation of the amplitude, pitch data representing a pitch of 
the input voice signal, and spectral shape data representing a spectral shape of the 
input voice signal; 

synthesizing new attribute data based on both of the original attribute data 
derived from the input voice signal and target attribute data being characteristic of the 
target voice signal composed of a target sinusoidal component and a target residual 
component other than the sinusoidal component, the target attribute data being derived 
from at least the target sinusoidal component, and containing amplitude data 
representing an amplitude of the target voice signal in the form of static amplitude data 
representing a basic variation of the amplitude and vibrato-like amplitude data 
representing a minute variation of the amplitude, superposed on the basic variation of 
the amplitude, pitch data representing a pitch of the target voice signal, and spectral 
shape data representing a spectral shape of the target voice signal, a synthesizing
device selecting the static amplitude data, the vibrato-like amplitude data, the pitch data
and the spectral shape data from either of the original attribute data and the target
attribute data so as to synthesize the new attribute in the form of a combination of the
selected static amplitude data, the selected vibrato-like amplitude data, the selected
pitch data and the selected spectral shape data; and

producing the output voice signal based on new attribute data and either of the 
original residual component and the target residual component. 

Claims 41-49 (cancelled).

50. (currently amended) A machine readable medium used in a computer 
machine having a CPU, the medium containing program instructions executable by the 
CPU to cause the computer machine for performing a process of converting an input 
voice signal into an output voice signal according to a target voice signal, the process 
comprising the steps of: 

providing the input voice signal composed of an original sinusoidal component 
and an original residual component other than the original sinusoidal component; 

extracting original attribute data from at least the sinusoidal component of the 
input voice signal, the original attribute data being characteristic of the input voice signal 
and containing amplitude data representing an amplitude of the input voice signal in the 
form of static amplitude data representing a basic variation of the amplitude and 
vibrato-like amplitude data representing a minute variation of the amplitude, 
superposed on the basic variation of the amplitude, pitch data representing a pitch of 
the input voice signal, and spectral shape data representing a spectral shape of the 
input voice signal;

synthesizing new attribute data based on both of the original attribute data 
derived from the input voice signal and target attribute data being characteristic of the 
target voice signal composed of a target sinusoidal component and a target residual 
component other than the sinusoidal component, the target attribute data being derived 
from at least the target sinusoidal component, and containing amplitude data 
representing an amplitude of the target voice signal in the form of static amplitude data 
representing a basic variation of the amplitude and vibrato-like amplitude data 
representing a minute variation of the amplitude, superposed on the basic variation of 
the amplitude, pitch data representing a pitch of the target voice signal, and spectral 
shape data representing a spectral shape of the target voice signal, a synthesizing
device selecting the static amplitude data, the vibrato-like amplitude data, the pitch data 
and the spectral shape data from either of the original attribute data and the target 
attribute data so as to synthesize the new attribute in the form of a combination of the 
selected static amplitude data, the selected vibrato-like amplitude data, the selected 
pitch data and the selected spectral shape data; and

producing the output voice signal based on new attribute data and either of the 
original residual component and the target residual component. 

Claims 51-59 (cancelled).

60. (currently amended) An apparatus for converting an input voice signal into 
an output voice signal according to a target voice signal, the apparatus comprising: 

an input device that provides the input voice signal composed of an original 
sinusoidal component and an original residual component other than the original 
sinusoidal component;

an extracting device that extracts original attribute data from at least the 
sinusoidal component of the input voice signal, the original attribute data being 
characteristic of the input voice signal and containing amplitude data representing an 
amplitude of the input voice signal, pitch data representing a pitch of the input voice 
signal in the form of static pitch data representing a basic variation of the pitch and 
vibrato-like pitch data representing a minute variation of the pitch, superimposed on the 
basic variation of the pitch, and spectral shape data representing a spectral shape of 
the input voice signal; 

a synthesizing device that synthesizes new attribute data based on both of the 
original attribute data derived from the input voice signal and target attribute data being 
characteristic of the target voice signal composed of a target sinusoidal component and 
a target residual component other than the sinusoidal component, the target attribute 
data being derived from at least the target sinusoidal component, and containing 
amplitude data representing an amplitude of the target voice signal, pitch data 
representing a pitch of the target voice signal in the form of static pitch data 
representing a basic variation of the pitch and vibrato-like pitch data representing a 
minute variation of the pitch, superposed on the basic variation of the pitch, and 
spectral shape data representing a spectral shape of the target voice signal, the
synthesizing device selecting the amplitude data, the static pitch data, the vibrato-like
pitch data and the spectral shape data from either of the original attribute data and the
target attribute data so as to synthesize the new attribute data in the form of a
combination of the selected amplitude data, the selected static pitch data, the selected
vibrato-like pitch data and the selected spectral shape data; and

an output device that operates based on the new attribute data and either of the 
original residual component and the target residual component for producing the output 
voice signal. 

61. (previously presented) The apparatus according to claim 60, wherein the
synthesizing device operates based on both the original attribute data composed of a 
set of original attribute data elements and target attribute data composed of another set 
of target attribute data elements in correspondence with one another to define each 
corresponding pair of the original attribute data element and the target attribute data 
element, such that the synthesizing device selects one of the original attribute data 
element and the target attribute data element, such that the synthesizing device selects 
one of the original attribute data element and the target attribute data element from 
each corresponding pair for synthesizing the new attribute data composed of a set of 
new attribute data elements each selected from each corresponding pair. 

62. (previously presented) The apparatus according to claim 60, wherein the 
synthesizing device operates based on both of the original attribute data composed of a 
set of original attribute data elements and the target attribute data composed of another 
set of target attribute data elements in correspondence with one another to define each 
corresponding pair of the original attribute data element and the target attribute data 
element, such that the synthesizing device interpolates with one another the original 
attribute data element and the target attribute data element of each corresponding pair 
for synthesizing the new attribute data composed of a set of new attribute data 
elements each interpolated from each corresponding pair. 

63. (previously presented) The apparatus according to claim 60, further
comprising a peripheral device that provides the target attribute data containing pitch 
data representing a pitch of the target voice signal at a standard key, and a key control 
device that operates when a user key different than the standard key is designated to the
input voice signal for adjusting the pitch data according to a difference between the 
standard key and the user key. 

64. (previously presented) The apparatus according to claim 60, further 
comprising a peripheral device that provides the target attribute data divided into a 
sequence of frames arranged at a standard tempo of the target voice signal, and a 
tempo control device that operates when a user tempo different from the standard 
tempo is designated to the input voice signal for adjusting the sequence of frames of 
the target attribute data according to a difference between the standard tempo and the 
user tempo, thereby enabling the synthesizing device to synthesize the new attribute 
data based on both the original attribute data and the target attribute data 
synchronously with each other at the user tempo designated to the input voice signal. 

65. (previously presented) The apparatus according to claim 64, wherein the 
tempo control device adjusts the sequence of the frames of the target attribute data 
according to a difference between the standard tempo and the user tempo, such that 
an additional frame of the target attribute data is filled into the sequence of the frames 
according to the difference between the standard tempo and the user tempo, such that 
an additional frame of the target attribute data is filled into the sequence of frames of 
the target attribute data by interpolation of the target attribute data so as to match with a 
sequence of frames of the original attribute data provided from the extracting device. 

66. (previously presented) The apparatus according to claim 60, further
comprising a synchronizing device that compares the target attribute data provided in 
the form of a first sequence of frames with the original attribute data provided in the 
form of a second sequence of frames so as to detect a false frame that is present in the 
second sequence but is absent from the first sequence, and that selects a dummy 
frame occurring around the false frame in the first sequence so as to compensate for 
the false frame, thereby synchronizing the first sequence containing the dummy frame 
to the second sequence containing the false frame. 

67. (previously presented) The apparatus according to claim 60, wherein the 
synthesizing device modifies the new attribute data so that the output device produces 
the output voice signal based on the modified new attribute data. 

68. (previously presented) The apparatus according to claim 60, wherein the 
synthesizing device synthesizes additional attribute data in addition to the new attribute 
data so that the output device concurrently produces the output voice signal based on 
the new attribute data and an additional voice signal based on the additional attribute 
data in a different pitch than that of the output voice signal. 

69. (currently amended) A method of converting an input voice signal into an 
output voice signal according to a target voice signal, the method comprising the steps 
of: 

providing the input voice signal composed of an original sinusoidal component 
and an original residual component other than the original sinusoidal component; 

extracting original attribute data from at least the sinusoidal component of the 
input voice signal, the original attribute data being characteristic of the input voice signal
and containing amplitude data representing an amplitude of the input voice signal, pitch 
data representing a pitch of the input voice signal in the form of static pitch data 
representing a basic variation of the pitch and vibrato-like pitch data representing a 
minute variation of the pitch, superimposed on the basic variation of the pitch, and 
spectral shape data representing a spectral shape of the input voice signal; 

synthesizing new attribute data based on both of the original attribute data
derived from the input voice signal and target attribute data being characteristic of the 
target voice signal composed of a target sinusoidal component and a target residual 
component other than the sinusoidal component, the target attribute data being derived 
from at least the target sinusoidal component, and containing amplitude data 
representing an amplitude of the target voice signal, pitch data representing a pitch of 
the target voice signal in the form of static pitch data representing a basic variation of 
the pitch and vibrato-like pitch data representing a minute variation of the pitch, 
superposed on the basic variation of the pitch, and spectral shape data representing a 
spectral shape of the target voice signal, a synthesizing device selecting the static
amplitude data, the vibrato-like amplitude data, the pitch data and the spectral shape 
data from either of the original attribute data and the target attribute data so as to 
synthesize the new attribute in the form of a combination of the selected static 
amplitude data, the selected vibrato-like amplitude data, the selected pitch data and the 
selected spectral shape data; and

operating based on the new attribute data and either of the original residual 
component and the target residual component for producing the output voice signal. 

70. (currently amended) A machine readable medium used in a computer 
machine having a CPU, the medium containing program instructions executable by the
CPU to cause the computer machine for performing a process of converting an input 
voice signal into an output voice signal according to a target voice signal, the process 
comprising the steps of: 

providing the input voice signal composed of an original sinusoidal component 
and an original residual component other than the original sinusoidal component; 

extracting original attribute data from at least the sinusoidal component of the 
input voice signal, the original attribute data being characteristic of the input voice signal 
and containing amplitude data representing an amplitude of the input voice signal, pitch 
data representing a pitch of the input voice signal in the form of static pitch data 
representing a basic variation of the pitch and vibrato-like pitch data representing a 
minute variation of the pitch, superimposed on the basic variation of the pitch, and 
spectral shape data representing a spectral shape of the input voice signal;

synthesizing new attribute data based on both of the original attribute data 
derived from the input voice signal and target attribute data being characteristic of the 
target voice signal composed of a target sinusoidal component and a target residual 
component other than the sinusoidal component, the target attribute data being derived 
from at least the target sinusoidal component, and containing amplitude data 
representing an amplitude of the target voice signal, pitch data representing a pitch of 
the target voice signal in the form of static pitch data representing a basic variation of 
the pitch and vibrato-like pitch data representing a minute variation of the pitch, 
superposed on the basic variation of the pitch, and spectral shape data representing a 
spectral shape of the target voice signal, a synthesizing device selecting the static
amplitude data, the vibrato-like amplitude data, the pitch data and the spectral shape
data from either of the original attribute data and the target attribute data so as to 
synthesize the new attribute in the form of a combination of the selected static 
amplitude data, the selected vibrato-like amplitude data, the selected pitch data and the 
selected spectral shape data; and

operating based on the new attribute data and either of the original residual 
component and the target residual component for producing the output voice signal. 